Introduction

Background and Motivation

Jargon is an innovative Chrome extension (Chrome Web Store, Official Website), created by my friend, that uses generative AI to transform English web content into learning opportunities. Launched in June 2024, Jargon offers two types of learning experiences: foreign language learning (Spanish, Chinese, etc.) and English style adaptation (GRE vocabulary, TikTok slang, etc.).

How Jargon Works

Customization Options

Figure 1: User Settings Interface showing customization options

Key Features

Language Selection

All types, from foreign languages like Spanish and Chinese to English variations such as TikTok Slang

Learning Goals

• Difficulty: Easy-Hard (1-10)
• Daily Target: 10-100 questions

Question Density

Controls the percentage of eligible sentences (0-100%) highlighted for practice on each webpage
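As a rough illustration, the density setting can be thought of as per-sentence sampling. The sketch below is hypothetical Python, not the extension's actual code (its selection logic is not public); `select_sentences` and its parameters are assumed names.

```python
import random

def select_sentences(eligible_sentences, density, seed=None):
    """Keep roughly `density`% of the eligible sentences for highlighting.

    Hypothetical sketch: each sentence is kept independently with
    probability density/100, so density=0 highlights nothing and
    density=100 highlights every eligible sentence.
    """
    rng = random.Random(seed)
    return [s for s in eligible_sentences if rng.random() * 100 < density]

sentences = [f"sentence {i}" for i in range(1000)]
picked = select_sentences(sentences, density=30, seed=42)
```

With 1000 eligible sentences and density=30, roughly 300 are picked on a given page load.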

Display Settings

• Text Style: Highlight or underline
• Site Controls: Enable/disable per website or temporarily

Text Selection Methods

Highlight Example

Figure 2a: Highlight Style - Text appears with background color emphasis

Underline Example

Figure 2b: Underline Style - Text appears with underline emphasis

Language Transformation Examples

Question Generation Process

Figure 3: Question Generation Process - Users select text from any webpage to create practice questions

GRE Question Answer

Figure 4a: GRE Mode - Advanced vocabulary transformation

TikTok Style Question

Figure 4b: TikTok Style - Contemporary social media language

Spanish Translation

Figure 4c: Spanish Mode - English to Spanish translation

The GRE mode enhances vocabulary learning by replacing common words with their more sophisticated alternatives (e.g., “good” becomes “exemplary”), while TikTok style transforms formal English into contemporary social media expressions (e.g., “That’s cool” becomes “That’s bussin fr fr”). These AI-powered transformations maintain the original meaning while adapting to different language registers.

Research Questions and Hypotheses

After 10 months of operation and 93 users, this analysis investigates two key aspects of user behavior:

  1. Usage Context and Platform Patterns
    • Research Question: “What are the common contexts and platforms where users engage with Jargon?”
    • Hypothesis: Users primarily engage with Jargon on social media and entertainment sites, and tend to disable it on academic sites.
    • Rationale: Understanding where users naturally integrate Jargon into their browsing can inform platform-specific optimization and marketing strategies.
  2. Feature Adoption and User Success
    • Research Question: “What features and settings distinguish active users from occasional users?”
    • Hypothesis: Active users utilize more customization options (density settings, highlight styles) and set achievable daily goals.
    • Rationale: Identifying the features that correlate with sustained engagement can guide onboarding improvements and feature prioritization.

Methods

Data Collection

The data for this analysis was collected from Jargon’s Supabase database, covering user interactions from the extension’s launch in June 2024 through March 16, 2025. The dataset comprises five main tables:

Table 1: Overview of Dataset Components
Dataset    Records  Description
Profiles   92       User profiles and settings
Questions  2442     Generated practice questions
Words      1594     Vocabulary entries and translations
Levels     117      User progression through difficulty levels
Websites   27       Websites where extension was disabled

Dataset Descriptions

1. Profiles Dataset

Table 2: Key Variables in Profiles Dataset
Variable            Type         Description                        Notes
user_id             Primary Key  Unique identifier for each user    Anonymized identifier
level               Integer      Current proficiency level          Range: 1-10
paused              Boolean      Extension status on Chrome         TRUE/FALSE (Default: TRUE)
chrome_notifs       Boolean      Notification preferences           TRUE/FALSE
language            String       Current selected language mode     e.g., ‘GRE Vocabulary’, ‘TikTok Slang’
last_question_time  DateTime     Timestamp of most recent question  UTC timezone
week_streak         Integer      Consecutive weeks of activity
daily_streak        Integer      Consecutive days of activity
daily_progress      Integer      Questions completed today          Resets daily
daily_goal          Integer      Target questions per day           User-set goal
density             Integer      Frequency of questions             Percentage of eligible sentences shown (0-100)
highlightStyle      String       Text selection preference          ‘highlight’ or ‘underline’

2. Questions Dataset

Table 3: Key Variables in Questions Dataset
Variable           Type             Description                 Notes
question_id        Primary Key      Unique question identifier
user_id            Foreign Key      Associated user             References profiles
created_at         DateTime         Question generation time    UTC timezone
sentence           Text             Original selected text      English source content
word               String           Target word for learning
language           String           Transformation mode         Selected language mode
original_sentence  Text             Source text                 Pre-transformation content
options_array      Array of String  Multiple choice options     Even indices: options in target language; odd indices: English translations
answered_at        DateTime         Completion timestamp        NULL if unanswered
chosen_option      String           User’s answer               NULL if unanswered
user_rating        Integer          Question quality rating     Feature not yet implemented
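The options_array layout described above (target-language options at even indices, English translations at odd indices) can be unpacked into option/translation pairs. A minimal Python sketch, where `split_options` is a hypothetical helper name, not part of Jargon's codebase:

```python
def split_options(options_array):
    """Split the flat options_array into (target-language option,
    English translation) pairs, per the even/odd index layout
    described in the Questions dataset."""
    if len(options_array) % 2 != 0:
        raise ValueError("expected alternating option/translation entries")
    # Even indices: target-language options; odd indices: translations.
    return list(zip(options_array[0::2], options_array[1::2]))

pairs = split_options(["hola", "hello", "adios", "goodbye"])
# pairs == [("hola", "hello"), ("adios", "goodbye")]
```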

3. Words Dataset

Table 4: Key Variables in Words Dataset
Variable     Type         Description           Notes
created_at   DateTime     Word entry timestamp  UTC timezone
word         String       Target vocabulary
language     String       Language mode
user_id      Foreign Key  Associated user       References profiles
translation  Text         English translation   AI-generated translation
status       String       Learning status       Currently all set to ‘learning’

4. Levels Dataset

Table 5: Key Variables in Levels Dataset
Variable  Type         Description       Notes
user_id   Foreign Key  Associated user   References profiles
language  String       Language mode
level     Integer      Difficulty level  Range: 1-10

5. Websites Dataset

Table 6: Key Variables in Websites Dataset
Variable  Type         Description      Notes
user_id   Foreign Key  Associated user  References profiles
website   String       Blocked URL      Sites where Jargon is disabled

Data Processing

Data Cleaning Steps

Profile Enhancement

  • Aggregated user activity metrics from various tables
  • Created derived engagement metrics:
    • Total questions generated
    • Questions answered
    • Number of blocked websites
    • Unique difficulty levels attempted
  • Handled missing values by replacing NAs with 0 for count-based metrics
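The aggregation and NA-to-0 steps above were performed in R; the following Python sketch shows the same idea on hypothetical in-memory rows (`enrich_profiles` and the dict field names are assumptions, chosen to match the tables in this section):

```python
from collections import Counter

def enrich_profiles(profiles, questions, websites):
    """Attach count-based engagement metrics to each profile,
    filling users with no recorded activity with 0 (the NA-to-0 step)."""
    generated = Counter(q["user_id"] for q in questions)
    answered = Counter(q["user_id"] for q in questions if q.get("answered_at"))
    blocked = Counter(w["user_id"] for w in websites)
    for p in profiles:
        uid = p["user_id"]
        # Counter.get returns 0-equivalent None avoided via default.
        p["generated_questions"] = generated.get(uid, 0)
        p["answered_questions"] = answered.get(uid, 0)
        p["blocked_sites"] = blocked.get(uid, 0)
    return profiles

profiles = [{"user_id": 1}, {"user_id": 2}]
questions = [
    {"user_id": 1, "answered_at": "2025-01-05T12:00:00"},
    {"user_id": 1, "answered_at": None},
]
websites = [{"user_id": 1, "website": "example.com"}]
enriched = enrich_profiles(profiles, questions, websites)
# User 2 has no activity, so every count is filled with 0.
```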

Derived Variables

Table 7: Overview of Derived Variables
Variable             Calculation                                       Purpose
generated_questions  Count of questions per user                       Measure overall engagement
answered_questions   Count of questions with answers                   Measure learning completion
blocked_sites        Count of blocked websites                         Understand avoidance patterns
levels_attempted     Count of unique language-difficulty combinations  Track learning progression

Data Exploration

Our exploratory data analysis examines patterns that inform both research questions about usage context and feature adoption. We organize our exploration into four main categories:

1. Platform and Website Interaction Patterns

Figure 5: Website Usage Analysis - Distribution of blocked websites by category (left) and frequency of individual websites (right)

The analysis of blocked websites reveals distinct patterns in how users interact with the Jargon extension. Professional tools—particularly Salesforce and AI platforms—are the most frequently blocked, suggesting that users tend to avoid using Jargon during work-related activities. The presence of development environment blocks indicates that some users are technical professionals, though this group represents only a modest portion of the overall user base. Educational content also features prominently among blocked websites, with users often disabling the extension on documentation sites and learning platforms, possibly to maintain focus during concentrated study sessions.

However, it is important to note that there are only 27 blocked sites across 92 users. This limited usage suggests that the blocking feature is not widely utilized, and the current data may not be conclusive. Caution should be exercised when generalizing these findings, as they may not fully represent the broader user population.

2. Language Mode and Feature Usage

Figure 6: Scatter plot showing the relationship between user adoption and question generation across different language modes

The scatter plot highlights key patterns in language mode usage:

  • Spanish is the most active mode, with the highest number of questions (~800) and users (~30).
  • GlizzyTalk and Tamil show moderate engagement (~300 questions each).
  • Korean and GRE Vocabulary form a middle tier (~200 questions).
  • Most other languages have low adoption, with fewer users and questions.
  • Some modes (e.g., Tamil) have high question counts despite fewer users, indicating intensive use by dedicated learners.

Overall, while usage intensity and adoption vary widely across languages, traditional language learning modes drive most activity.

Figure 7: Word frequency analysis showing common words (left) and word pairs (right) in learning content. Colors indicate frequency of occurrence, with darker shades representing higher frequencies.

Insights from Word and Phrase Frequency Analysis (based on the English original sentences selected for content generation):

  • Technical and Scientific Focus: The most common words and word pairs (e.g., “currents,” “ice,” “churn,” “concentric,” “ice form,” “churn water”) suggest that users frequently select technical or scientific content for practice, possibly from educational or informational sources.
  • Descriptive and Process-Oriented Language: Many frequent terms describe physical processes or states (e.g., “breeze,” “rolls,” “floating ball,” “gentle churn”), indicating an emphasis on dynamic or descriptive language in the learning material.
  • Recurring Themes: The recurrence of similar words and phrases (e.g., “form,” “water”) implies that certain concepts or topics are repeatedly practiced, which may reflect user interests or the nature of the source material.

Overall, the word frequency analysis reveals that users are engaging most with scientific and descriptive content, focusing on process-oriented vocabulary and recurring technical terms.

3. Temporal and Engagement Patterns

Figure 8: Daily activity patterns showing question generation and active users with their respective averages (dashed lines) over the observation period, based on UTC timezone. Questions average: 12.5 per day; Users average: 2.2 per day.

Figure 9: Weekly activity patterns showing average questions generated and active users by day of week (UTC timezone), with error bars indicating standard error.

The temporal analysis reveals several key patterns in user engagement, based on both daily and weekly activity (all timestamps in UTC):

  • Daily Trends: Question generation and active user counts fluctuate considerably day to day, with occasional spikes (up to 200 questions or 12 users), but most days remain below the averages (12.5 questions, 2.2 users). This indicates a small but steady user base, with 1–5 active users on most days.

  • Weekly Trends: Question generation is highest on Mondays, Tuesdays, and Wednesdays, then tapers off toward the weekend, suggesting users are more engaged during the workweek. There is substantial variability across days, as shown by the error bars.

Together, these patterns indicate that Jargon’s usage is characterized by low but regular engagement, with activity peaking midweek and significant day-to-day variability. This suggests a core group of users who interact with the platform most during the workweek.
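The per-day counts and averages shown in Figure 8 can be reproduced from the questions' created_at timestamps. A minimal Python sketch (the original analysis was done in R; `daily_activity` is a hypothetical helper), which treats days with no activity as zeros when averaging over the observed range:

```python
from collections import Counter
from datetime import datetime

def daily_activity(created_at_timestamps):
    """Count questions per UTC day and return the per-day average
    over the observed date range, counting inactive days as 0."""
    days = [datetime.fromisoformat(ts).date() for ts in created_at_timestamps]
    per_day = Counter(days)
    span = (max(days) - min(days)).days + 1  # inclusive day count
    return per_day, sum(per_day.values()) / span

counts, avg = daily_activity([
    "2024-06-01T10:00:00", "2024-06-01T11:30:00", "2024-06-03T09:15:00",
])
# 2 questions on June 1, 1 on June 3, over a 3-day span: avg = 1.0
```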

4. User Engagement Distribution

Figure 10: Distribution of key engagement metrics across users, showing individual violin plots for each metric with median and interquartile range (IQR) statistics. Each plot uses a distinct color and includes summary statistics.

The violin plots provide a clearer view of the distribution of user engagement metrics:

  • Generated Questions & Answered Questions: Most users generate and answer only a small number of questions, as shown by the wide base near zero. A few users are highly active, producing a long tail of outliers with much higher counts.
  • Blocked Sites: The vast majority of users do not block any sites (distribution concentrated at zero), with only a handful blocking more than one site.
  • Levels Attempted: Most users attempt only one level, with very few exploring multiple levels. The distribution is sharply peaked at one, with a small tail for higher values.

Overall, the violin plots highlight that engagement is highly skewed: most users interact minimally, while a small subset are much more active or exploratory. This pattern is consistent across all four metrics.
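The median and IQR annotations on the violin plots are standard summary statistics. A minimal Python sketch of how they can be computed (the report's plots were produced in R; `summary_stats` is a hypothetical helper, and `statistics.quantiles` uses the exclusive method by default):

```python
from statistics import median, quantiles

def summary_stats(values):
    """Median and interquartile range, as annotated on the violin plots."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    return {"median": median(values), "iqr": q3 - q1}

# A heavily skewed toy sample, mimicking the long-tailed engagement metrics.
stats = summary_stats([0, 0, 0, 1, 1, 2, 3, 10, 25])
```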

Further Analysis

Research Question 1: Usage Context and Platform Patterns

Sentiment Analysis of User-Selected Content

To further address our first research question—“What are the common contexts and platforms where users engage with Jargon?”—we performed sentiment analysis on the English original sentences that users selected for learning. Using the syuzhet package in R, each sentence was assigned a sentiment score, where positive values indicate positive sentiment, negative values indicate negative sentiment, and values near zero indicate neutral sentiment. This approach allows us to quantitatively assess the emotional tone of the content users choose to engage with.
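The actual scoring used R's syuzhet package; the toy Python sketch below illustrates only the general lexicon-based idea of scoring and bucketing, with a deliberately tiny made-up lexicon (real lexicons contain thousands of entries):

```python
# Toy lexicon: purely illustrative; real analyses use full lexicons
# such as those bundled with R's syuzhet package.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}

def sentiment_score(sentence):
    """Sum per-word valences: > 0 positive, < 0 negative, ~0 neutral."""
    words = sentence.lower().replace(".", "").replace(",", "").split()
    return sum(LEXICON.get(w, 0) for w in words)

def sentiment_category(score):
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# sentiment_score("The breeze feels great") == 2 -> "positive"
```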

Figure 11: Stacked bar graph showing the frequency of user-selected sentences in each sentiment category, stacked by language mode (top 5 languages shown in color, all others in grey).

The stacked bar graph shows the overall distribution of sentiment categories, with the top 5 language modes highlighted in color and all other languages grouped in grey. This visualization highlights both the predominance of neutral and slightly negative content and the relative engagement of different language modes—including less common ones—across sentiment categories.

Topic Modeling of User-Selected Content (LDA)

To further explore the contexts in which users engage with Jargon, we applied Latent Dirichlet Allocation (LDA) topic modeling to the English original sentences selected by users. In addition to standard stopwords, we removed a custom list of common or uninformative words to improve topic quality. This method uncovers the main themes or topics present in the content users choose to learn from.
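The LDA fitting itself is handled by a topic-modeling library; the Python sketch below illustrates only the stopword-removal step described above, with hypothetical stopword lists standing in for the actual custom list (which is not reproduced in this report):

```python
import re
from collections import Counter

# Hypothetical stopword lists; the report's actual custom list is not shown.
STANDARD_STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}
CUSTOM_STOPWORDS = {"like", "just", "really"}

def tokenize_for_lda(sentence):
    """Lowercase, strip punctuation, and drop standard plus custom
    stopwords before building the document-term matrix for LDA."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    stop = STANDARD_STOPWORDS | CUSTOM_STOPWORDS
    return [t for t in tokens if t not in stop]

docs = ["The currents churn the water", "Ice forms in concentric patterns"]
term_counts = Counter(t for d in docs for t in tokenize_for_lda(d))
# Uninformative words like "the" and "in" are dropped; content words
# ("currents", "churn", "water", "ice", ...) remain for the model.
```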

Figure 12: Top terms for each topic identified by LDA topic modeling of user-selected sentences. Each panel shows the most important words for one topic, with x-axis numbering visible for all.

The LDA topic modeling did not yield strong or actionable insights about the contexts or platforms where users engage with Jargon. The “Importance (beta)” values are all quite low (well below 0.05), which is typical for LDA on short texts or small datasets, but it also means that no single word dominates any topic. The topics identified are diffuse, with mostly generic or process-oriented terms. This suggests that either the user-selected content is too varied or generic for topic modeling to be effective, or that the dataset is not large or rich enough for LDA to find meaningful structure. This is a valid finding: not all analyses reveal clear patterns, and reporting this transparently demonstrates scientific rigor. It may also indicate that user engagement with Jargon is broad and not easily categorized, or that more data is needed for deeper insights.

Despite the weak themes, a tentative interpretation of the topics is as follows:

  • Topic 1: May relate to work processes or technical tasks (e.g., “work,” “parsing,” “incremental,” “curious”).
  • Topic 2: Appears to focus on scientific or physical phenomena, especially related to water and movement (e.g., “form,” “water,” “ice,” “currents,” “breeze”).
  • Topic 3: Suggests group actions or collective activities (e.g., “together,” “collect,” “balls,” “patterns”).
  • Topic 4: Includes terms that could relate to data, viewing, or content creation (e.g., “view,” “number,” “stack,” “videos,” “write”).

However, these interpretations are tentative due to the low importance values and the generic nature of the terms.

Research Question 2: Feature Adoption and User Success